
    Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

    Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way of obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on the preprocessing of the nonnegative input data matrix, and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image datasets.
    Comment: 34 pages, 11 figure

    Robustness Analysis of Hottopixx, a Linear Programming Model for Factoring Nonnegative Matrices

    Although nonnegative matrix factorization (NMF) is NP-hard in general, it has been shown very recently that it is tractable under the assumption that the input nonnegative data matrix is close to being separable (separability requires that all columns of the input matrix belong to the cone spanned by a small subset of these columns). Since then, several algorithms have been designed to handle this subclass of NMF problems. In particular, Bittorf, Recht, Ré and Tropp ("Factoring nonnegative matrices with linear programs", NIPS 2012) proposed a linear programming model, referred to as Hottopixx. In this paper, we provide a new and more general robustness analysis of their method. In particular, we design a provably more robust variant using a post-processing strategy which allows us to deal with duplicates and near duplicates in the dataset.
    Comment: 23 pages; new numerical results; Comparison with Arora et al.; Accepted in SIAM J. Mat. Anal. App

    Generalized Separable Nonnegative Matrix Factorization

    Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for nonnegative data with applications such as image analysis, text mining, audio source separation and hyperspectral unmixing. Given a data matrix $M$ and a factorization rank $r$, NMF looks for a nonnegative matrix $W$ with $r$ columns and a nonnegative matrix $H$ with $r$ rows such that $M \approx WH$. NMF is NP-hard to solve in general. However, it can be computed efficiently under the separability assumption which requires that the basis vectors appear as data points, that is, that there exists an index set $\mathcal{K}$ such that $W = M(:,\mathcal{K})$. In this paper, we generalize the separability assumption: We only require that for each rank-one factor $W(:,k)H(k,:)$ for $k=1,2,\dots,r$, either $W(:,k) = M(:,j)$ for some $j$ or $H(k,:) = M(i,:)$ for some $i$. We refer to the corresponding problem as generalized separable NMF (GS-NMF). We discuss some properties of GS-NMF and propose a convex optimization model which we solve using a fast gradient method. We also propose a heuristic algorithm inspired by the successive projection algorithm. To verify the effectiveness of our methods, we compare them with several state-of-the-art separable NMF algorithms on synthetic, document and image data sets.
    Comment: 31 pages, 12 figures, 4 tables. We have added discussions about the identifiability of the model, we have modified the first synthetic experiment, we have clarified some aspects of the contributio
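
The separability assumption in this abstract can be illustrated with a minimal sketch of the classical successive projection algorithm (SPA) that the abstract names as inspiration. This is not the paper's GS-NMF method; the function name `spa`, the tolerance, and the toy matrix are illustrative assumptions. SPA is generally understood to recover the index set when the basis matrix has full column rank and each column of the weight matrix sums to at most one, which the toy data is built to satisfy.

```python
import math

def spa(M, r, tol=1e-12):
    """Sketch of the successive projection algorithm (hypothetical helper).

    M is a list of rows (m x n); returns up to r column indices that are
    candidate basis columns under the separability assumption.
    """
    m, n = len(M), len(M[0])
    # Residual stored column by column so each data point is one list.
    R = [[M[i][j] for i in range(m)] for j in range(n)]
    selected = []
    for _ in range(r):
        sq_norms = [sum(x * x for x in col) for col in R]
        j = max(range(n), key=sq_norms.__getitem__)
        if sq_norms[j] <= tol:
            break  # residual is numerically zero, nothing left to select
        selected.append(j)
        # Unit vector along the selected residual column (a copy, so the
        # in-place updates below do not change it).
        u = [x / math.sqrt(sq_norms[j]) for x in R[j]]
        # Project every column onto the orthogonal complement of u.
        for col in R:
            dot = sum(a * b for a, b in zip(u, col))
            for i in range(m):
                col[i] -= dot * u[i]
    return selected

# Toy separable matrix (illustrative data): columns 1, 3 and 4 are the basis
# vectors; columns 0 and 2 are convex combinations of them.
M = [
    [2.5, 1.0, 1.1, 4.0, 0.0],
    [1.5, 3.0, 1.4, 0.0, 1.0],
    [0.5, 0.0, 1.2, 1.0, 2.0],
]
K = spa(M, 3)
print(sorted(K))
```

The greedy rule works because, under the assumptions stated above, the column of largest norm is always one of the basis columns, and projecting it out preserves that property for the remaining ones.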

    Sequential Dimensionality Reduction for Extracting Localized Features

    Linear dimensionality reduction techniques are powerful tools for image analysis as they allow the identification of important features in a data set. In particular, nonnegative matrix factorization (NMF) has become very popular as it is able to extract sparse, localized and easily interpretable features by imposing an additive combination of nonnegative basis elements. Nonnegative matrix underapproximation (NMU) is a closely related technique that has the advantage of identifying features sequentially. In this paper, we propose a variant of NMU that is particularly well suited for image analysis as it incorporates the spatial information, that is, it takes into account the fact that neighboring pixels are more likely to be contained in the same features, and favors the extraction of localized features by looking for sparse basis elements. We show that our new approach competes favorably with comparable state-of-the-art techniques on synthetic, facial and hyperspectral image data sets.
    Comment: 24 pages, 12 figures. New numerical experiments on synthetic data sets, discussion about the convergenc
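
For context, the standard rank-one NMU problem that this abstract builds on can be written as follows. This is the usual formulation from the NMU literature, not the spatially regularized variant the paper proposes; the symbols $w$, $h$ and $R$ are notation assumed here.

```latex
\min_{w \ge 0,\; h \ge 0} \; \| M - w h^{T} \|_F^2
\quad \text{such that} \quad w h^{T} \le M ,
\qquad\text{then}\qquad
R = M - w h^{T} \ge 0 .
```

Because the underapproximation constraint keeps the residual $R$ nonnegative, the same problem can be applied again to $R$, which is what allows features to be identified one at a time.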

    A Fast Gradient Method for Nonnegative Sparse Regression with Self Dictionary

    A nonnegative matrix factorization (NMF) can be computed efficiently under the separability assumption, which asserts that all the columns of the given input data matrix belong to the cone generated by a (small) subset of them. The provably most robust methods to identify these conic basis columns are based on nonnegative sparse regression and self dictionaries, and require the solution of large-scale convex optimization problems. In this paper we study a particular nonnegative sparse regression model with self dictionary. As opposed to previously proposed models, this model yields a smooth optimization problem where the sparsity is enforced through linear constraints. We show that the Euclidean projection on the polyhedron defined by these constraints can be computed efficiently, and propose a fast gradient method to solve our model. We compare our algorithm with several state-of-the-art methods on synthetic data sets and real-world hyperspectral images.
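
The general pattern of a fast gradient method with a cheap Euclidean projection can be sketched on a much simpler problem than the one in this abstract. The sketch below assumes plain nonnegative least squares, where the feasible polyhedron is the nonnegative orthant and the projection is a componentwise `max(0, .)`; the paper's self-dictionary model and its polyhedron are not reproduced here. The function name `fast_gradient_nnls`, the Frobenius-norm step-size bound, and the iteration count are all illustrative assumptions.

```python
import math

def fast_gradient_nnls(A, b, iters=500):
    """Accelerated projected gradient sketch for min_{x >= 0} 0.5*||Ax - b||^2.

    A is a list of rows (m x n), b a list of length m. Hypothetical helper,
    not the algorithm from the paper.
    """
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient (the squared spectral norm), so 1/L is a safe step size.
    L = sum(a * a for row in A for a in row) or 1.0
    x = [0.0] * n   # current iterate
    y = x[:]        # extrapolated point
    t = 1.0         # momentum parameter
    for _ in range(iters):
        # Gradient at y: A^T (A y - b).
        res = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        grad = [sum(A[i][j] * res[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by Euclidean projection onto x >= 0.
        x_new = [max(0.0, y[j] - grad[j] / L) for j in range(n)]
        # Nesterov-style momentum update.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        y = [x_new[j] + beta * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x

# With A equal to the identity, the minimizer is b with negatives clipped to 0.
x = fast_gradient_nnls([[1.0, 0.0], [0.0, 1.0]], [1.0, -2.0])
print(x)
```

The design point carried over from the abstract is that each iteration costs one gradient evaluation plus one projection, so the method stays practical whenever the projection is cheap.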